Exploratory tools for clustering multivariate data
Abstract
The forward search provides a series of robust parameter estimates based on increasing numbers of observations. The resulting series of robust Mahalanobis distances is used to cluster multivariate normal data. The method depends on envelopes of the distribution of the test statistics in forward plots. These envelopes can be found by simulation; flexible polynomial approximations to the envelopes are given. New graphical tools provide methods not only of detecting clusters but also of determining their membership. Comparisons are made with mclust and k-means clustering. © 2007 Elsevier B.V. All rights reserved.
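The abstract gives no algorithmic detail, so the following is only a minimal sketch of how a forward search and its simulated envelopes might be computed, not the authors' implementation. The function names `forward_search` and `simulated_envelope`, the choice of starting subset (the units closest to the coordinate-wise median), the default initial size `p + 1`, and the quantile levels are all assumptions made for illustration. The statistic monitored here is the minimum Mahalanobis distance among units outside the current subset.

```python
import numpy as np


def forward_search(X, m0=None):
    """Return (subset sizes, minimum Mahalanobis distance outside the subset)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    if m0 is None:
        m0 = p + 1  # assumed initial subset size

    # Assumed start: the m0 units closest to the coordinate-wise median,
    # with each coordinate scaled by its median absolute deviation.
    med = np.median(X, axis=0)
    mad = np.median(np.abs(X - med), axis=0)
    mad[mad == 0] = 1.0
    subset = np.argsort(np.sum(((X - med) / mad) ** 2, axis=1))[:m0]

    sizes, min_dist = [], []
    for m in range(m0, n):
        # Estimate mean and covariance from the current subset only.
        mu = X[subset].mean(axis=0)
        cov = np.cov(X[subset], rowvar=False)
        diff = X - mu
        # Squared Mahalanobis distances of all n units (pinv guards
        # against a singular covariance for very small subsets).
        d2 = np.einsum("ij,jk,ik->i", diff, np.linalg.pinv(cov), diff)
        d2 = np.maximum(d2, 0.0)

        outside = np.ones(n, dtype=bool)
        outside[subset] = False
        sizes.append(m)
        min_dist.append(np.sqrt(d2[outside].min()))

        # The next subset is the m + 1 units with the smallest distances.
        subset = np.argsort(d2)[: m + 1]
    return np.array(sizes), np.array(min_dist)


def simulated_envelope(n, p, n_sim=200, q=(0.01, 0.5, 0.99), seed=0):
    """Pointwise quantiles of the forward statistic under a single normal sample."""
    rng = np.random.default_rng(seed)
    runs = np.array(
        [forward_search(rng.standard_normal((n, p)))[1] for _ in range(n_sim)]
    )
    return np.quantile(runs, q, axis=0)


# Illustrative data: two well-separated bivariate normal groups.
rng = np.random.default_rng(1)
X_demo = np.vstack(
    [rng.standard_normal((40, 2)), rng.standard_normal((20, 2)) + 6.0]
)
sizes, min_dist = forward_search(X_demo)
envelope = simulated_envelope(n=60, p=2, n_sim=50)
```

Plotting `min_dist` against `sizes` together with the rows of `envelope` gives a forward plot; values that leave the upper envelope suggest that the units about to enter the subset do not belong to the same group as those already in it.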
Similar resources
Interpretability and Informativeness of Clustering Methods for Exploratory Analysis of Clinical Data
Clustering methods are among the most commonly used tools for exploratory data analysis. However, using clustering to perform data analysis can be challenging for modern datasets that contain a large number of dimensions, are complex in nature, and lack a ground-truth labeling. Traditional tools, like summarization and plotting of clusters, are of limited benefit in a high-dimensional setting. ...
Incorporating Density Estimation into Other Exploratory
Preliminary understanding of a new data set is routinely accomplished with graphical tools, such as those popularized originally by EDA. A number of more recent ideas for multivariate data analysis have emerged and some are available in software packages or shareware such as XGobi. In this talk, we illustrate how many of the point-oriented techniques can be supplemented by incorporating nonpara...
Model-Based Clustering and Classification of Functional Data
The problem of complex data analysis is a central topic of modern statistical science and learning systems and is becoming of broader interest with the increasing prevalence of high-dimensional data. The challenge is to develop statistical models and autonomous algorithms that are able to acquire knowledge from raw data for exploratory analysis, which can be achieved through clustering technique...
An Empirical Comparison of Distance Measures for Multivariate Time Series Clustering
Multivariate time series (MTS) data are ubiquitous in science and daily life, and how to measure their similarity is a core part of MTS analyzing process. Many of the research efforts in this context have focused on proposing novel similarity measures for the underlying data. However, with the countless techniques to estimate similarity between MTS, this field suffers from a lack of comparative...
A growth curve approach to analyzing multiple-valued expression data
There is significant literature which explores methods for clustering time-series gene-expression data sets, such as the classical data set due to Spellman et al. (1998). For instance, James and Hastie (2001) use linear or quadratic discriminant functions on fitted curves, while Bar-Joseph et al. (2003), using a similar approach, do the clustering based on the coefficients of the fitted splines. I...
Journal:
- Computational Statistics & Data Analysis
Volume 52
Publication date: 2007